Escambia County
- Asia > China (0.14)
- North America > United States > California (0.05)
- North America > United States > Tennessee > Davidson County > Nashville (0.04)
- Media > News (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)
CryptoQA: A Large-scale Question-answering Dataset for AI-assisted Cryptography
Elfares, Mayar, Reisert, Pascal, Dietz, Tilman, Barman, Manpa, Zaki, Ahmed, Küsters, Ralf, Bulling, Andreas
Large language models (LLMs) excel at many general-purpose natural language processing tasks. However, their ability to perform deep reasoning and mathematical analysis, particularly for complex tasks as required in cryptography, remains poorly understood, largely due to the lack of suitable data for evaluation and training. To address this gap, we present CryptoQA, the first large-scale question-answering (QA) dataset specifically designed for cryptography. CryptoQA contains over two million QA pairs drawn from curated academic sources, along with contextual metadata that can be used to test the cryptographic capabilities of LLMs and to train new LLMs on cryptographic tasks. We benchmark 15 state-of-the-art LLMs on CryptoQA, evaluating their factual accuracy, mathematical reasoning, consistency, referencing, backward reasoning, and robustness to adversarial samples. In addition to quantitative metrics, we provide expert reviews that qualitatively assess model outputs and establish a gold-standard baseline. Our results reveal significant performance deficits of LLMs, particularly on tasks that require formal reasoning and precise mathematical knowledge. This shows the urgent need for LLM assistants tailored to cryptography research and development. We demonstrate that, by using CryptoQA, LLMs can be fine-tuned to exhibit better performance on cryptographic tasks.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- North America > United States > Illinois (0.04)
- North America > United States > Florida > Escambia County > Pensacola (0.04)
Improving LLM-based Ontology Matching with fine-tuning on synthetic data
Sousa, Guilherme, Lima, Rinaldo, Trojahn, Cassia
Large Language Models (LLMs) are increasingly being integrated into various components of Ontology Matching pipelines. This paper investigates the capability of LLMs to perform ontology matching directly on ontology modules and generate the corresponding alignments. It also explores how a dedicated fine-tuning strategy can enhance the model's matching performance in a zero-shot setting. The proposed method incorporates a search space reduction technique to select relevant subsets from both source and target ontologies, which are then used to automatically construct prompts. Because reference alignments for training are scarce, a novel LLM-based approach is introduced for generating a synthetic dataset. This process creates a corpus of ontology submodule pairs and their corresponding reference alignments, specifically designed to fine-tune an LLM for the ontology matching task. The proposed approach was evaluated on the Conference, Geolink, Enslaved, Taxon, and Hydrography datasets from the OAEI complex track. The results demonstrate that the LLM fine-tuned on the synthetically generated data outperforms the non-fine-tuned base model. The key contribution is a strategy that combines automatic dataset generation with fine-tuning to effectively adapt LLMs for ontology matching tasks.
- Europe > France > Occitanie > Haute-Garonne > Toulouse (0.76)
- Europe > Greece > Central Macedonia > Thessaloniki (0.05)
- Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.05)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)
Estimating Visceral Adiposity from Wrist-Worn Accelerometry
Williamson, James R., Alini, Andrew, Telfer, Brian A., Potter, Adam W., Friedl, Karl E.
Visceral adipose tissue (VAT) is a key marker of both metabolic health and habitual physical activity (PA). Excess VAT is highly correlated with type 2 diabetes and insulin resistance. The mechanistic basis for this pathophysiology relates to overloading the liver with fatty acids. VAT is also a highly labile fat depot, with increased turnover stimulated by catecholamines during exercise. VAT can be measured with sophisticated imaging technologies, but can also be inferred directly from PA. We tested this relationship using National Health and Nutrition Examination Survey (NHANES) data from 2011-2014, for individuals aged 20-60 years with 7 days of accelerometry data (n=2,456 men; 2,427 women) [1]. Two approaches were used for estimating VAT from activity. The first used engineered features based on movements during gait and sleep, followed by ridge regression to map summary statistics of these features into a VAT estimate. The second approach used deep neural networks trained on 24 hours of continuous accelerometry. A foundation model first mapped each 10 s frame into a high-dimensional feature vector. A transformer model then mapped each day's feature vector time series into a VAT estimate, and the daily estimates were averaged over multiple days. For both approaches, the most accurate estimates were obtained with the addition of covariate information about subject demographics and body measurements. The best performance was obtained by combining the two approaches, resulting in VAT estimates with correlations of r=0.86. These findings demonstrate a strong relationship between PA and VAT and, by extension, between PA and metabolic health risks.
- North America > United States > Massachusetts > Middlesex County > Lexington (0.04)
- North America > United States > Massachusetts > Middlesex County > Natick (0.04)
- North America > United States > Massachusetts > Middlesex County > Waltham (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.87)
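The first approach in the visceral adiposity abstract above (ridge regression from activity features to a VAT estimate, scored by correlation) can be illustrated with a minimal sketch. It assumes a single summary feature rather than the many engineered features the abstract describes, and the data, the function names `fit_ridge_1d` and `pearson_r`, and the penalty value are all hypothetical, not taken from the paper.

```python
from math import sqrt

def fit_ridge_1d(x, y, lam):
    """Fit y ~ w * x + b with an L2 penalty `lam` on w (intercept unpenalized)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    w = sxy / (sxx + lam)      # the penalty shrinks the slope toward zero
    b = mean_y - w * mean_x
    return w, b

def pearson_r(u, v):
    """Pearson correlation between two equal-length sequences."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    sd_u = sqrt(sum((a - mean_u) ** 2 for a in u))
    sd_v = sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (sd_u * sd_v)

# Hypothetical toy data: one activity summary statistic per subject and a VAT value.
activity = [1.0, 2.0, 3.0, 4.0, 5.0]
vat = [5.1, 3.9, 3.2, 1.8, 1.0]

w, b = fit_ridge_1d(activity, vat, lam=1.0)
predicted = [w * a + b for a in activity]
r = pearson_r(predicted, vat)
```

With several features the same idea generalizes to solving the regularized normal equations; the one-feature form is used here only to keep the sketch dependency-free.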
Evaluation of Machine and Deep Learning Techniques for Cyclone Trajectory Regression and Status Classification by Time Series Data
Lo, Ethan Zachary, Lo, Dan Chie-Tien
Accurate cyclone forecasting is essential for minimizing loss of life, infrastructure damage, and economic disruption. Traditional numerical weather prediction models, though effective, are computationally intensive and prone to error due to the chaotic nature of atmospheric systems. This study proposes a machine learning (ML) approach to forecasting tropical cyclone trajectory and status using time series data from the National Hurricane Center, including recently added best track wind radii. A two-stage ML pipeline is developed: a regression model first predicts cyclone features (maximum wind speed, minimum pressure, trajectory length, and directional change) using a sliding window of historical data. These outputs are then input into classification models to predict the cyclone's categorical status. Gradient boosting regression and three classifiers, random forest (RF), support vector machine (SVM), and multi-layer perceptron (MLP), are evaluated. After hyperparameter tuning and synthetic minority oversampling (SMOTE), the RF classifier achieves the highest performance with 93% accuracy, outperforming SVM and MLP across precision, recall, and F1 score. The RF model is particularly robust in identifying minority cyclone statuses and minimizing false negatives. Regression results yield low mean absolute errors, with pressure and wind predictions within 2.2 mb and 2.4 kt, respectively. These findings demonstrate that ML models, especially ensemble-based classifiers, offer an effective, scalable alternative to traditional forecasting methods, with potential for real-time cyclone prediction and integration into decision-support systems.
- North America > The Bahamas (0.14)
- North America > United States > Georgia > Cobb County > Marietta (0.04)
- North America > United States > Florida > Escambia County > Pensacola (0.04)
- Research Report > New Finding (0.48)
- Research Report > Experimental Study (0.46)
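The sliding window of historical data mentioned in the cyclone abstract above can be sketched as follows. This shows only how windowed training pairs might be built for the regression stage; the window length, the two-column record layout, and the function name `sliding_windows` are assumptions for illustration, not details from the paper.

```python
def sliding_windows(series, window):
    """Pair each run of `window` consecutive observations with the next observation."""
    inputs, targets = [], []
    for start in range(len(series) - window):
        inputs.append(series[start:start + window])  # history fed to the regressor
        targets.append(series[start + window])       # value the regressor should predict
    return inputs, targets

# Hypothetical track records: (maximum wind in kt, minimum pressure in mb).
track = [(35, 1005), (40, 1002), (50, 997), (65, 988), (80, 975)]

X, y = sliding_windows(track, window=3)
```

In a two-stage setup like the one described, a regressor would be fitted on `X` and `y`, and its predicted features would then be passed to a separate status classifier.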
Agentic Reasoning for Robust Vision Systems via Increased Test-Time Compute
Yu, Chung-En, Jalaian, Brian, Bastian, Nathaniel D.
Developing trustworthy intelligent vision systems for high-stakes domains, e.g., remote sensing and medical diagnosis, demands broad robustness without costly retraining. We propose Visual Reasoning Agent (VRA), a training-free, agentic reasoning framework that wraps off-the-shelf vision-language models and pure vision systems in a Think-Critique-Act loop. While VRA incurs significant additional test-time computation, it achieves up to 40% absolute accuracy gains on challenging visual reasoning benchmarks. Future work will optimize query routing and early stopping to reduce inference overhead while preserving reliability in vision tasks.
ORCA: Agentic Reasoning For Hallucination and Adversarial Robustness in Vision-Language Models
Yu, Chung-En Johnny, Chen, Hsuan-Chih, Jalaian, Brian, Bastian, Nathaniel D.
Large Vision-Language Models (LVLMs) exhibit strong multimodal capabilities but remain vulnerable to hallucinations from intrinsic errors and adversarial attacks from external exploitations, limiting their reliability in real-world applications. We present ORCA, an agentic reasoning framework that improves the factual accuracy and adversarial robustness of pretrained LVLMs through test-time structured inference reasoning with a suite of small vision models (less than 3B parameters). ORCA operates via an Observe-Reason-Critique-Act loop, querying multiple visual tools with evidential questions, validating cross-model inconsistencies, and refining predictions iteratively without access to model internals or retraining. ORCA also stores intermediate reasoning traces, which supports auditable decision-making. Though designed primarily to mitigate object-level hallucinations, ORCA also exhibits emergent adversarial robustness without requiring adversarial training or defense mechanisms. We evaluate ORCA across three settings: (1) clean images on hallucination benchmarks, (2) adversarially perturbed images without defense, and (3) adversarially perturbed images with defense applied. On the POPE hallucination benchmark, ORCA improves standalone LVLM performance by +3.64% to +40.67% across different subsets. Under adversarial perturbations on POPE, ORCA achieves an average accuracy gain of +20.11% across LVLMs. When combined with defense techniques on adversarially perturbed AMBER images, ORCA further improves standalone LVLM performance, with gains ranging from +1.20% to +48.00% across evaluation metrics. These results demonstrate that ORCA offers a promising path toward building more reliable and robust multimodal systems.
- North America > United States > Florida > Escambia County > Pensacola (0.04)
- Asia > China (0.04)
- Government > Regional Government > North America Government > United States Government (1.00)
- Government > Military (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
ConvSearch-R1: Enhancing Query Reformulation for Conversational Search with Reasoning via Reinforcement Learning
Zhu, Changtai, Wang, Siyin, Feng, Ruijun, Song, Kai, Qiu, Xipeng
Conversational search systems require effective handling of context-dependent queries that often contain ambiguity, omission, and coreference. Conversational Query Reformulation (CQR) addresses this challenge by transforming these queries into self-contained forms suitable for off-the-shelf retrievers. However, existing CQR approaches suffer from two critical constraints: high dependency on costly external supervision from human annotations or large language models, and insufficient alignment between the rewriting model and downstream retrievers. We present ConvSearch-R1, the first self-driven framework that completely eliminates dependency on external rewrite supervision by leveraging reinforcement learning to optimize reformulation directly through retrieval signals. Our novel two-stage approach combines Self-Driven Policy Warm-Up to address the cold-start problem through retrieval-guided self-distillation, followed by Retrieval-Guided Reinforcement Learning with a specially designed rank-incentive reward shaping mechanism that addresses the sparsity issue in conventional retrieval metrics. Extensive experiments on TopiOCQA and QReCC datasets demonstrate that ConvSearch-R1 significantly outperforms previous state-of-the-art methods, achieving over 10% improvement on the challenging TopiOCQA dataset while using smaller 3B parameter models without any external supervision.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- North America > United States > Florida > Escambia County > Pensacola (0.04)
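The ConvSearch-R1 abstract above says a rank-incentive reward addresses the sparsity of conventional retrieval metrics, but it does not give the reward function. The sketch below only illustrates the general idea of contrasting a sparse top-k reward with a graded, rank-dependent one. The logarithmic decay, the cutoffs, and the names `sparse_reward` and `shaped_reward` are assumptions, not the paper's definition.

```python
from math import log2

def sparse_reward(rank, k=10):
    """Hit-or-miss signal: 1.0 only when the gold passage is retrieved within the top k."""
    return 1.0 if rank is not None and rank <= k else 0.0

def shaped_reward(rank, cutoff=100):
    """Graded signal (assumed form): decays with rank but stays positive up to `cutoff`."""
    if rank is None or rank > cutoff:
        return 0.0
    return 1.0 / log2(1 + rank)

# A rewrite whose gold passage lands at rank 50 earns nothing under the sparse
# reward, but still receives a small learning signal under the shaped one.
rank_of_gold = 50
signals = (sparse_reward(rank_of_gold), shaped_reward(rank_of_gold))
```

The design point is that a policy early in training rarely places the gold passage in the top k, so a graded reward gives it something to climb.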
Neurosymbolic AI Transfer Learning Improves Network Intrusion Detection
Tran, Huynh T. T., Sander, Jacob, Cohen, Achraf, Jalaian, Brian, Bastian, Nathaniel D.
Transfer learning is commonly utilized in various fields such as computer vision, natural language processing, and medical imaging due to its impressive capability to address subtasks and work with different datasets. However, its application in cybersecurity has not been thoroughly explored. In this paper, we present an innovative neurosymbolic AI framework designed for network intrusion detection systems, which play a crucial role in combating malicious activities in cybersecurity. Our framework leverages transfer learning and uncertainty quantification. The findings indicate that transfer learning models, trained on large and well-structured datasets, outperform neural-based models that rely on smaller datasets, paving the way for a new era in cybersecurity solutions.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- North America > United States > Florida > Escambia County > Pensacola (0.04)
- Information Technology > Security & Privacy (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)
- Government > Military > Cyberwarfare (0.76)
DecMetrics: Structured Claim Decomposition Scoring for Factually Consistent LLM Outputs
Claim decomposition plays a crucial role in the fact-checking process by breaking down complex claims into simpler atomic components and identifying their non-factual elements. Despite its importance, current research primarily focuses on generative methods for decomposition, with insufficient emphasis on evaluating the quality of these decomposed atomic claims. To bridge this gap, we introduce DecMetrics, which comprises three new metrics: COMPLETENESS, CORRECTNESS, and SEMANTIC ENTROPY, designed to automatically assess the quality of claims produced by decomposition models. Utilizing these metrics, we develop a lightweight claim decomposition model, optimizing its performance through the integration of these metrics as a reward function. Through automatic evaluation, our approach aims to set a benchmark for claim decomposition, enhancing both the reliability and effectiveness of fact-checking systems.
- North America > United States > Florida > Escambia County > Pensacola (0.05)
- North America > United States > Florida > Miami-Dade County > Miami (0.05)
- Europe > Sweden (0.04)